Automatic Document Summarization by Sentence Extraction

Author

  • R. M. Aliguliyev
Abstract

We present a method for automatic document summarization that generates a summary by clustering the sentences of the source document and extracting sentences from it. The advantage of the proposed approach is that the generated summary can cover the main content of practically all topics present in the document. To determine the optimal number of clusters, a criterion for evaluating clustering quality is introduced.
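
The abstract does not name the clustering algorithm, the similarity measure, or the quality criterion, so the following is only a minimal sketch of the general idea under stated assumptions: sentences are bag-of-words vectors, clustering is a cosine-based k-means with deterministic seeding, and the number of clusters is chosen by a simple "cohesion minus separation" score. All of these choices, and every function name below, are hypothetical stand-ins rather than the author's method.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def vectorize(sentence):
    """Bag-of-words term-frequency vector (assumed representation)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(vectors, k, iterations=20):
    """Cosine k-means with farthest-first seeding (assumed algorithm)."""
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(vectors)) if i not in seeds]
        # Next seed: the sentence least similar to the seeds chosen so far.
        seeds.append(min(
            rest, key=lambda i: max(cosine(vectors[i], vectors[s]) for s in seeds)))
    centers = [vectors[s] for s in seeds]
    labels = None
    for _ in range(iterations):
        new_labels = [max(range(k), key=lambda c: cosine(v, centers[c]))
                      for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                # A summed vector serves as centroid; cosine ignores scale.
                centers[c] = sum(members, Counter())
    return labels, centers


def quality(vectors, labels, centers):
    """Assumed criterion: within-cluster cohesion minus between-center similarity."""
    k = len(centers)
    cohesion = sum(cosine(v, centers[l]) for v, l in zip(vectors, labels)) / len(vectors)
    pairs = [(a, b) for a in range(k) for b in range(a + 1, k)]
    separation = sum(cosine(centers[a], centers[b]) for a, b in pairs) / len(pairs)
    return cohesion - separation


def summarize(text, max_k=None):
    """Pick one representative sentence per cluster, in document order."""
    sentences = split_sentences(text)
    vectors = [vectorize(s) for s in sentences]
    max_k = max_k or max(2, len(sentences) // 2)
    best = None
    for k in range(2, min(max_k, len(sentences)) + 1):
        labels, centers = cluster(vectors, k)
        score = quality(vectors, labels, centers)
        if best is None or score > best[0]:
            best = (score, labels, centers)
    if best is None:  # fewer than two sentences: nothing to cluster
        return sentences
    _, labels, centers = best
    picked = set()
    for c, center in enumerate(centers):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            picked.add(max(members, key=lambda i: cosine(vectors[i], center)))
    return [sentences[i] for i in sorted(picked)]
```

Calling `summarize(text)` returns a list of sentences taken verbatim from `text`. Because one sentence is drawn from every non-empty cluster, each detected topic contributes to the summary, which is the property the abstract emphasizes. The cap on the number of clusters (`max_k`) is an added assumption: a cohesion-style score tends to improve as clusters shrink, so some upper bound is needed.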

Similar resources

Biogeography-Based Optimization Algorithm for Automatic Extractive Text Summarization

Given the increasing number of documents, sites, online sources, and the users’ desire to quickly access information, automatic textual summarization has caught the attention of many researchers in this field. Researchers have presented different methods for text summarization as well as a useful summary of those texts including relevant document sentences. This study select...

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text, while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data, in order to extract the useful information, is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Results of CRL/NYU System at DUC-2003 and an Experiment on Division of Document Sets

We participated in three multi-document summarization tasks at the DUC-2003 formal run and evaluated the performance of our summarization system. Our summarization system based on sentence extraction also incorporated a module to estimate similarity between sentences for multi-document summarization. The similarity information was used for selecting the representative sentence among similar sen...

NTT/NAIST's Text Summarization Systems for TSC-2

In this paper, we describe the following two approaches to summarization: (1) only sentence extraction, (2) sentence extraction + bunsetsu elimination. For both approaches, we use the machine learning algorithm called Support Vector Machines. We participated in both Task-A (single-document summarization task) and Task-B (multi-document summarization task) of TSC-2.

Similarity-based Multilingual Multi-Document Summarization

We present a new approach for summarizing clusters of documents on the same event, some of which are machine translations of foreign-language documents and some of which are English. Our approach to multilingual multi-document summarization uses text similarity to choose sentences from English documents based on the content of the machine translated documents. A manual evaluation shows that 68%...

Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies

We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-documen...

Publication date: 2009